Abstract

Recent work by Erik Volz [12] has shown how to calculate the growth 
and eventual decay of an SIR epidemic on a static random network, assum- 
ing infection and recovery each happen at constant rates. This calculation 
allows us to account for effects due to heterogeneity in degree that are ne- 
glected in the standard mass-action SIR equations. In this note we offer 
an alternate derivation which arrives at a simpler — though equivalent — 
system of governing equations to that of Volz. This new derivation is more 
closely connected to the underlying physical processes, and the resulting 
equations are of comparable complexity to the mass-action SIR equations. 
We further show that earlier derivations of the final size of epidemics on 
networks can be reproduced using the same approach, thereby providing 
a common framework for calculating both the dynamics and the final size 
of an epidemic spreading on a random network. 

1 Introduction 

Infectious diseases are constrained to spread along the contacts of a population. 
Mathematical models investigating epidemics typically assume that the contacts 
occur through mass action mixing [SJ [T] . However true populations violate some 
mass-action assumptions in a manner affecting the epidemic dynamics. Recently 
a number of investigations have been performed using random networks El 
El lll[ [7] which allow for a better accounting of mixing in the population. 

Unlike mass-action models, random networks allow for the number of con- 
tacts individuals have to remain bounded as the population size increases. Thus 
once an individual infects a contact, the number of available contacts to infect 
decreases by a non-negligible amount. Random networks also allow for more
accurate representation of heterogeneities in the number of contacts compared
with mass-action models. In a population with heterogeneous contact levels,
individuals with more contacts are preferentially infected early in the epidemic
(and in turn cause more infections), while at the end of the epidemic the
remaining susceptibles tend to have fewer contacts.

A number of analytic results have been found for epidemic probability or
size in random networks, but with only a few exceptions (notably [12]),
no analytic attention has been paid to the dynamics of the growth in networks.
However, some attempts have been made using pair approximations which track
the number of joined pairs of individuals with $k_1$ contacts and $k_2$ contacts in
each infection state [3] (assuming infection and recovery occur at constant rates).
For a network with $n$ different degrees, such a model results in $O(n^2)$ coupled
differential equations.

Recent work by Volz [12] has shown that it is possible to investigate the dynamics
of epidemic spread on Configuration Model networks (described below) using 
a coupled system of only three ODEs (again assuming infection and recovery 
occur at constant rates). The resulting system has many nonlinear terms, but 
the number of equations does not grow with the number of different degrees. In 
this note we derive a single differential equation that can capture the dynamics 
with only a single higher order term. The framework we develop to calculate 
the dynamics can also be applied to predicting the final size of an epidemic in 
a concise way. We reproduce earlier results in this context. 

Although our results are equivalent to pre-existing results, we place previous 
calculations of epidemic size and epidemic dynamics into a common framework. 
The equations we derive are simpler, and the terms in the equations are more 
easily interpreted. The resulting calculations for the numbers of susceptible, 
infected, and recovered individuals are of comparable complexity to the standard 
mass-action SIR equations, but allow for more realistic population interactions. 

In section 2 we develop the framework for the later sections. In section 3 we
apply this framework to calculating the time course of an epidemic. In section 4
we apply this framework to calculating the final size of an epidemic. Finally, in
section 5 we discuss the significance of these calculations.

2 The framework 

We represent the population by a network. Each individual is thought of as a 
node joined to other nodes by edges through which disease can spread. We use 
Configuration Model (CM) networks [TU] to model the population. To generate 
a CM network, the degree or number of edges of each node, k, is assigned with 
probability P(k) based on a given degree distribution. If the sum of degrees 
is odd, all degrees are reassigned until the sum is even. Then each node is 
placed into a list with repetition equal to its degree, the list is randomized, and 
each node in position $2n$ ($n = 0, 1, \ldots$) is connected with the node in position
$2n + 1$. The resulting network constitutes a uniform choice from the networks
with the given degree distribution. In general the network may have self-loops 
or repeated edges. For degree distributions with finite mean, the impact of this
effect is negligible in sufficiently large networks and we ignore it. We define

\[ \psi(x) = \sum_{k=0}^{\infty} P(k)\,x^k \]

the probability generating function of the degree distribution. Note that $\psi'(1) =
\langle K \rangle$ is the average degree. An example CM network is shown in figure 1. For
many important distributions, $\psi$ takes a simple form; for example, a Poisson
distribution with parameter $\lambda$ has $\psi(x) = e^{\lambda(x-1)}$.

Figure 1: A sample Configuration Model network with 70 nodes. The degree
distribution is chosen such that $P(3) = 0.5$ and $P(1) = 0.5$. Thus $\psi(x) =
(x^3 + x)/2$.
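
The generation procedure described in this section (assign degrees, list each node once per edge end, randomize, and join positions $2n$ and $2n + 1$) can be sketched in Python as follows. This is only an illustrative sketch, not the code used for the paper: the function names `sample_degrees`, `configuration_model`, and `psi`, the dictionary representation of $P(k)$, and the fixed random seed are all assumptions made here for the example.

```python
import random

def sample_degrees(degree_probs, n_nodes, rng):
    """Draw n_nodes degrees from {k: P(k)}; redraw all of them until the sum is even."""
    ks = list(degree_probs)
    ps = [degree_probs[k] for k in ks]
    while True:
        degrees = rng.choices(ks, weights=ps, k=n_nodes)
        if sum(degrees) % 2 == 0:
            return degrees

def configuration_model(degrees, rng):
    """Return an edge list; self-loops and repeated edges are allowed."""
    if sum(degrees) % 2 != 0:
        raise ValueError("the sum of degrees must be even")
    # Each node appears in the list once per edge end ("stub").
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(stubs)
    # Positions 2n and 2n + 1 of the randomized list are joined.
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]

def psi(x, degree_probs):
    """Probability generating function psi(x) = sum_k P(k) x^k."""
    return sum(p * x**k for k, p in degree_probs.items())

rng = random.Random(0)                 # hypothetical seed, for reproducibility only
probs = {1: 0.5, 3: 0.5}               # the degree distribution of figure 1
degrees = sample_degrees(probs, 70, rng)
edges = configuration_model(degrees, rng)
```

With `probs` as above, `psi(x, probs)` evaluates $(x^3 + x)/2$, matching the caption of figure 1.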

Nodes in the network are assigned to one of three classes: susceptible,
infected, or recovered. We denote the fraction of the population in each class by
$S$, $I$, and $R$ respectively. A susceptible node becomes infected at rate $n\beta$ where
$n$ is the number of infected neighbors it has. Once infected, a node recovers
at rate $\gamma$. A recovered node plays no further role in the spread. Typically an
outbreak is initiated with a single randomly chosen infected individual in an
otherwise susceptible population.

We define an infectious contact from v to its neighbor u to be a contact when 
v is infectious that would cause infection of u if u were susceptible. Physically 
this is the transmission of an infectious dose from $v$ to $u$. An individual can
cause infectious contact only when infected. However, an individual can receive 
an infectious contact regardless of his/her state, and so an infectious contact 
does not necessarily lead to infection. 

We use $\theta$ as a measure of the probability that a random edge has not
transmitted an infectious contact. Its precise definition is subtle, but important. To
define $\theta$, we choose an edge uniformly from all edges. We then choose a direction
for that edge, say from $v$ to $u$. We refer to $v$ as the base and $u$ as the target. We
modify the spread of the disease by disallowing infectious contacts from $u$ to $v$.
Then $\theta(\infty)$ is the probability that there is never an infectious contact from $v$
to $u$, while $\theta(t)$ is the probability that at time $t$ there has not been infectious
contact from $v$ to $u$. If we did not disallow infection from the target, then an
infection of $u$ from some other source would make infection of $v$ more
likely, which in turn would make infectious contact from $v$ to $u$ more likely.
Transmission along different edges to the same target would then not be independent,
thereby complicating the analysis.

Under the assumption that the spread is deterministic, the cumulative size 
of an epidemic at a given time is equal to the probability a randomly chosen 
node has been infected. Disallowing infection from that single randomly chosen 
node may impact the dynamics after that node is infected, but it does not 
modify the probability that that single node has become infected. Consequently, 
to calculate the size at a given time, it suffices to calculate the probability a 
randomly chosen node that cannot infect its neighbors has been infected, or 
alternately, is still susceptible. 

3 Dynamics 

To calculate the dynamics, we calculate the fraction of the population that has 
not yet been infected. To do this, we look at the probability that a randomly 
chosen node is not yet infected at time t. We choose a random target u and 
disallow infection from $u$ to all of its neighbors. Using $\theta$ as defined above, if the
degree of $u$ is $k$, then the probability that $u$ is still susceptible is $\theta(t)^k$. Thus
the fraction of susceptibles is

\[ S(t) = \sum_{k=0}^{\infty} P(k)\,\theta(t)^k = \psi(\theta(t)). \tag{1} \]

To calculate the rate of change of $\theta$, we will need to know how many of those
edges that have not transmitted an infectious contact have the opportunity to
transmit infection at any given time. That is, we need to know what proportion
of all edges have not had an infectious contact but come from an infected base
node. We set $\phi$ to be the probability that the base node of an edge is infected
but the edge has not transmitted infection (assuming as for $\theta$ that the target
node does not cause infection). Those edges which satisfy the definition for $\phi$
are a subset of those which satisfy the definition for $\theta$.

We derive coupled differential equations for $\theta$ and $\phi$. The rate of change in
the probability a random edge has not transmitted infection is equal to the rate
at which infection crosses edges

\[ \dot{\theta} = -\beta\phi. \tag{2} \]

An edge no longer satisfies the definition for $\phi$ when infection crosses the edge or
when the base node recovers. An edge from $v$ to $u$ begins to satisfy the definition
if $v$ becomes infectious. The rate at which neighbors become infectious matches
the rate at which neighbors stop being susceptible. We use $h(t)$ to denote the
probability that a neighbor is susceptible, so $\dot{\phi} = -(\beta + \gamma)\phi - (d/dt)h(t)$.

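We now find $h(t)$. A node is more likely to be a neighbor if it has more
contacts [1], and so the probability the neighbor has degree $k$ is $kP(k)/\langle K \rangle$.
The neighbor can only be infected by an edge other than the one from the target
node. Thus

\[ h(t) = \frac{\sum_{k=0}^{\infty} k P(k)\,\theta^{k-1}}{\langle K \rangle} = \frac{\psi'(\theta)}{\psi'(1)}. \]

Thus the neighbor becomes infectious at rate $-(d/dt)h(t) = \beta\phi\,\psi''(\theta)/\psi'(1)$.
We finally get

\[ \dot{\phi} = -(\beta + \gamma)\phi + \beta\phi\,\frac{\psi''(\theta)}{\psi'(1)}. \tag{3} \]

In fact, we can integrate this equation using (2) to get

\[ \phi = 1 - (1 - \theta) - \frac{\gamma}{\beta}(1 - \theta) - \frac{\psi'(\theta)}{\psi'(1)}. \]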
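
As a consistency check on the integration step, one can differentiate the closed form $\phi = \theta - (\gamma/\beta)(1 - \theta) - \psi'(\theta)/\psi'(1)$ and substitute $\dot{\theta} = -\beta\phi$. The short calculation below is a sketch added for illustration; it assumes that the constant of integration is fixed by $\phi = 0$ at $\theta = 1$ (no infected base nodes before any transmission).

```latex
\begin{align*}
\dot{\phi}
  &= \dot{\theta}\left[1 + \frac{\gamma}{\beta} - \frac{\psi''(\theta)}{\psi'(1)}\right]
   = -\beta\phi\left[1 + \frac{\gamma}{\beta} - \frac{\psi''(\theta)}{\psi'(1)}\right] \\
  &= -(\beta + \gamma)\phi + \beta\phi\,\frac{\psi''(\theta)}{\psi'(1)},
\end{align*}
% and at \theta = 1 the closed form gives \phi = 1 - 0 - 0 - 1 = 0.
```

This reproduces the differential equation for $\phi$ term by term.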

The term $1 - \theta$ represents the probability the edge has transmitted an infectious
contact, the term $(\gamma/\beta)(1 - \theta)$ represents the probability that the base node has
been infected but recovered without an infectious contact, and $\psi'(\theta)/\psi'(1)$
represents the probability that the base node is still susceptible. The complement
of all such edges is exactly those edges which have not transmitted infection but
connect to an infected base node. Consequently we arrive at

\[ \dot{\theta} = -\beta\theta + \gamma(1 - \theta) + \beta\,\frac{\psi'(\theta)}{\psi'(1)}. \tag{4} \]

The epidemiological quantity of interest is only rarely the proportion of edges
which have or have not transmitted infection, but rather it is usually the values
of $S$, $I$, and $R$. We can calculate $S(t) = \psi(\theta(t))$ directly. It is not difficult
to show that $\dot{R} = \gamma I$, and conservation of individuals gives $I = 1 - S - R$.
Consequently, we can augment equation (4) with

\[ \dot{R} = \gamma I, \qquad S = \psi(\theta), \qquad I = 1 - R - S \]

to find $S$, $I$, and $R$.
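
To make the system concrete, the sketch below integrates $\dot{\theta} = -\beta\theta + \gamma(1 - \theta) + \beta\psi'(\theta)/\psi'(1)$ together with $\dot{R} = \gamma I$, $S = \psi(\theta)$, $I = 1 - S - R$ using a classical fourth-order Runge-Kutta step. It is an illustration under stated assumptions rather than the authors' code: the Poisson degree distribution, the helper names (`poisson_pgf`, `derivatives`, `integrate`), the step size, and the crude initial condition $\theta(0) = 1 - 10^{-4}$, $R(0) = 0$ are all choices made here for the example.

```python
import math

def poisson_pgf(mean):
    """Return (psi, psi') for a Poisson degree distribution (illustrative choice)."""
    def psi(x):
        return math.exp(mean * (x - 1.0))
    def dpsi(x):
        return mean * math.exp(mean * (x - 1.0))
    return psi, dpsi

def derivatives(theta, R, beta, gamma, psi, dpsi):
    """Right-hand sides: the single theta equation and dR/dt = gamma * I."""
    I = 1.0 - psi(theta) - R
    dtheta = -beta * theta + gamma * (1.0 - theta) + beta * dpsi(theta) / dpsi(1.0)
    return dtheta, gamma * I

def integrate(beta, gamma, psi, dpsi, eps=1e-4, dt=0.01, t_end=40.0):
    """RK4 from theta(0) = 1 - eps, R(0) = 0 (a simplifying assumption)."""
    theta, R, t = 1.0 - eps, 0.0, 0.0
    args = (beta, gamma, psi, dpsi)
    history = []
    for _ in range(int(round(t_end / dt))):
        S = psi(theta)
        history.append((t, S, 1.0 - S - R, R))
        k1 = derivatives(theta, R, *args)
        k2 = derivatives(theta + 0.5 * dt * k1[0], R + 0.5 * dt * k1[1], *args)
        k3 = derivatives(theta + 0.5 * dt * k2[0], R + 0.5 * dt * k2[1], *args)
        k4 = derivatives(theta + dt * k3[0], R + dt * k3[1], *args)
        theta += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        R += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        t += dt
    S = psi(theta)
    history.append((t, S, 1.0 - S - R, R))
    return history

psi, dpsi = poisson_pgf(4.0)
history = integrate(beta=1.3, gamma=2.0, psi=psi, dpsi=dpsi)
t_final, S_final, I_final, R_final = history[-1]
print(f"t = {t_final:.1f}: S = {S_final:.4f}, I = {I_final:.2e}, R = {R_final:.4f}")
```

Only $\psi$ and $\psi'$ enter the right-hand side, so swapping in another degree distribution means replacing `poisson_pgf` and nothing else.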

In order to solve our equations, we need to find appropriate initial
conditions. At the earliest stages, the outbreak grows stochastically, and so the
deterministic equations are not yet appropriate. If an epidemic occurs, eventually
the outbreak infects a large number of nodes and then behaves effectively
deterministically. In a sufficiently large population we can assume that
deterministic behavior begins while the proportion infected is still small compared
to the population.
Figure 2: Plots of cumulative infections $I + R$ against time. Predicted epidemic
dynamics (thick, broken curves) and final sizes compared with simulations (solid
curves) in CM networks of 500 000 individuals with $\beta = 1.3$ and $\gamma = 2$. (a) Every
node has degree 4. (b) Poisson degree distribution of mean 4. (c) A bimodal
distribution: $P(1) = 5/12$, $P(2) = 1/12$, $P(6) = 1/12$, and $P(7) = 5/12$. (d)
A truncated power law with $P(k) \propto e^{-k/40} k^{-2.5}$.



Once the stochastic phase is over, we have $\theta = 1 - \epsilon$ with $\epsilon \ll 1$. At
early time $\epsilon \propto \exp[(-\beta - \gamma + \beta\psi''(1)/\psi'(1))t]$ unless $\psi''(1)$ is infinite (which
corresponds to an infinite variance in the degree distribution such as occurs in
some power-law distributions). For simplicity we assume that $\psi''(1)$ is finite (if it
were not, growth would not be exponential initially and this calculation would
require more attention). We define $t = 0$ to correspond to a time when the
epidemic is sufficiently large that the outbreak proceeds deterministically, but
the proportion affected is still small. From the value of $\theta$ we can easily calculate
$S(t) = \psi(\theta(t))$, and thus we can also calculate $I + R$.

To distinguish the number of current infections ($I$) from recovered infections
($R$) requires somewhat more effort. To find the early behavior for $R$, we
note that $I$ and $\epsilon$ are linearly related at early time, so that
$I \propto \exp[(-\beta - \gamma + \beta\psi''(1)/\psi'(1))t]$. Then $\dot{R} = \gamma I$ gives
$R = \gamma I/[-\beta - \gamma + \beta\psi''(1)/\psi'(1)]$.
Combined with $I + R = 1 - S = 1 - \psi(1 - \epsilon)$ this gives $R$ at $t = 0$.
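
The two conditions $R = \gamma I / r$, with early growth rate $r = -\beta - \gamma + \beta\psi''(1)/\psi'(1)$, and $I + R = 1 - \psi(1 - \epsilon)$ can be solved for $I$ and $R$ at $t = 0$. The helper below is a sketch of that algebra only; the function name `early_time_state`, its argument list, and the Poisson example with $\epsilon = 10^{-4}$ are assumptions introduced here.

```python
import math

def early_time_state(eps, beta, gamma, psi, dpsi_at_1, d2psi_at_1):
    """Return (theta, S, I, R) at t = 0 for theta = 1 - eps, assuming exponential growth."""
    r = -beta - gamma + beta * d2psi_at_1 / dpsi_at_1   # early growth rate
    if r <= 0:
        raise ValueError("growth rate is not positive; no epidemic is expected")
    theta = 1.0 - eps
    S = psi(theta)
    cumulative = 1.0 - S                  # I + R
    # R = gamma * I / r and I + R = cumulative  =>  I = cumulative * r / (r + gamma)
    I = cumulative * r / (r + gamma)
    R = cumulative - I
    return theta, S, I, R

# Illustrative use: Poisson degree distribution with mean 4, so that
# psi'(1) = 4 and psi''(1) = 16 (example values chosen here).
mean = 4.0
psi = lambda x: math.exp(mean * (x - 1.0))
theta0, S0, I0, R0 = early_time_state(1e-4, 1.3, 2.0, psi, mean, mean**2)
```

These values could replace a cruder choice such as $R(0) = 0$ when starting a numerical integration of the deterministic equations.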

We show a comparison of simulation with results calculated using
equation (4) in figure 2. The results show good agreement, except for time shifts
resulting from stochastic effects in the simulations while the outbreak size is
still small.
3.1 Discussion



Equation (4) contrasts with the original system of [12] which uses three
equations. In addition to the variable $\theta$, the system of [12] uses $p_I = \phi/\theta$ (the
probability that an edge is connected to an infected node given that it has not
transmitted infection to the target node) in place of $\phi$ and an additional variable
$p_S$ (the probability that an edge is connected to a susceptible node given that
it has not transmitted infection to the target node):

\begin{align*}
\dot{\theta} &= -\beta p_I \theta, \\
\dot{p}_I &= p_I \left[ \beta p_S \theta\,\frac{\psi''(\theta)}{\psi'(\theta)} - (\beta + \gamma) + \beta p_I \right], \\
\dot{p}_S &= \beta p_S p_I \left[ 1 - \theta\,\frac{\psi''(\theta)}{\psi'(\theta)} \right].
\end{align*}

We have replaced this system by the single equation (4) with only one higher
order term. To see that these systems are equivalent, we note the $p_S$ equation
can be eliminated by observing that the probability the neighbor has not been
infected is $\psi'(\theta)/\psi'(1)$ and so $p_S = \psi'(\theta)/[\theta\psi'(1)]$. Equation (3) can be modified
by using $\psi'(1) = \psi'(\theta)/(\theta p_S)$ and $\phi = \theta p_I$ to arrive at the same $p_I$ equation.
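
The substitution can be written out explicitly. The steps below are a sketch added for illustration; they use only $\phi = \theta p_I$, $\dot{\theta} = -\beta\phi$, the equation $\dot{\phi} = -(\beta + \gamma)\phi + \beta\phi\,\psi''(\theta)/\psi'(1)$, and $p_S = \psi'(\theta)/[\theta\psi'(1)]$.

```latex
\begin{align*}
\dot{\phi} &= \dot{\theta}\,p_I + \theta\,\dot{p}_I
            = -\beta\theta p_I^2 + \theta\,\dot{p}_I, \\
\theta\,\dot{p}_I &= -(\beta + \gamma)\,\theta p_I
            + \beta\,\theta p_I\,\frac{\psi''(\theta)}{\psi'(1)}
            + \beta\theta p_I^2, \\
\dot{p}_I &= p_I\left[\beta\,\frac{\psi''(\theta)}{\psi'(1)} - (\beta + \gamma) + \beta p_I\right]
           = p_I\left[\beta p_S \theta\,\frac{\psi''(\theta)}{\psi'(\theta)} - (\beta + \gamma) + \beta p_I\right],
\end{align*}
% where the last step uses 1/\psi'(1) = \theta p_S / \psi'(\theta).
```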



4 Final epidemic size 

We now reproduce some of the earliest results for epidemics on networks 
El 12 by calculating the final size of epidemics (under the assumption that the 
outbreak does not die out during the stochastic phase). We can find this by
solving equation (4) for $\dot{\theta} = 0$. However, this approach is unnecessarily specific
and we can easily generalize to disease processes that do not depend on constant
infection and recovery rates by calculating $\theta(\infty)$ directly rather than through
equations for the intermediate dynamics. To simplify notation in this section
we use $\theta_\infty$ to represent $\theta(\infty)$ as we are not interested in the epidemic state at
intermediate time.

To calculate the epidemic size, we look for the probability that a randomly
chosen node $u$ is never infected. If a node has degree $k$, then the probability
that it is never infected is $\theta_\infty^k$. From this we get

\[ S(\infty) = \sum_{k=0}^{\infty} P(k)\,\theta_\infty^k = \psi(\theta_\infty). \tag{5} \]

We must calculate $\theta_\infty$. We set $T = \int_0^\infty \gamma e^{-\gamma\tau}(1 - e^{-\beta\tau})\,d\tau = \beta/(\gamma + \beta)$. This
is the probability that a randomly chosen neighbor has an infectious contact
with $u$ given that the neighbor becomes infected. If $h$ is the probability that the
neighbor does not become infected (given that $u$ does not transmit infection),
the probability of infectious contact is $T(1 - h)$. Thus the probability of not
transmitting is

\[ \theta_\infty = 1 - T(1 - h) = 1 - T + Th. \]
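
For completeness, the integral defining $T$ can be evaluated in two lines. In this sketch $\gamma e^{-\gamma\tau}$ is read as the density of the infectious period $\tau$ and $1 - e^{-\beta\tau}$ as the probability that at least one infectious contact crosses the edge during that period, both under the constant-rate assumption.

```latex
\begin{align*}
T &= \int_0^\infty \gamma e^{-\gamma\tau}\,d\tau
     - \int_0^\infty \gamma e^{-(\gamma + \beta)\tau}\,d\tau \\
  &= 1 - \frac{\gamma}{\gamma + \beta} = \frac{\beta}{\gamma + \beta}.
\end{align*}
```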
An argument in the previous section shows that $h = \psi'(\theta_\infty)/\psi'(1)$, and so $\theta_\infty$
solves the implicit relation

\[ \theta_\infty = 1 - T + T\,\frac{\psi'(\theta_\infty)}{\psi'(1)}. \tag{6} \]

Using (5) and (6) together gives $S(\infty)$, and the final size of an epidemic is simply
$1 - S(\infty)$.
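
The implicit relation $\theta_\infty = 1 - T + T\psi'(\theta_\infty)/\psi'(1)$ can be solved numerically by fixed-point iteration. The sketch below is one possible approach, not a method taken from the paper: the function names, the tolerance, and the Poisson example are assumptions. Starting the iteration at $\theta = 0$ is a deliberate choice so that the iterates increase toward the smallest root in $[0, 1]$, rather than sitting at the trivial root $\theta_\infty = 1$.

```python
import math

def poisson_pgf(mean):
    """Return (psi, psi') for a Poisson degree distribution (illustrative choice)."""
    def psi(x):
        return math.exp(mean * (x - 1.0))
    def dpsi(x):
        return mean * math.exp(mean * (x - 1.0))
    return psi, dpsi

def final_size(beta, gamma, psi, dpsi, tol=1e-12, max_iter=100_000):
    """Return (1 - S(infinity), theta_infinity) by iterating the implicit relation."""
    T = beta / (beta + gamma)             # transmission probability along an edge
    theta = 0.0                           # start away from the trivial root theta = 1
    for _ in range(max_iter):
        new_theta = 1.0 - T + T * dpsi(theta) / dpsi(1.0)
        converged = abs(new_theta - theta) < tol
        theta = new_theta
        if converged:
            break
    return 1.0 - psi(theta), theta

psi, dpsi = poisson_pgf(4.0)
size, theta_inf = final_size(beta=1.3, gamma=2.0, psi=psi, dpsi=dpsi)
print(f"theta_inf = {theta_inf:.4f}, final size = {size:.4f}")
```

Convergence is slow close to the epidemic threshold, where the slope of the iterated map at the root approaches 1; a root finder such as bisection would be a reasonable alternative there.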

Note that the ability of a base node to infect a neighbor depends on duration 
of infection and whether the base node becomes infected. Consequently, infec- 
tious contacts along different edges out of the same node are not independent 
events (they both depend on the base node's properties). However, this does 
not affect our calculations because infectious contacts along different edges into 
the same node are independent events. If there were variation in susceptibility, 
more work would be needed [8]. Also the independence assumption will fail if 
short cycles are not negligible because infection of one neighbor is correlated 
with infection of another. 



5 Discussion 

We have shown that calculations for both the final size and the dynamics of an 
epidemic on a random network can be placed into a common framework. This 
framework allows us to simplify previous calculations of the dynamics [12]. Our
calculations match closely to simulations, except for time shifts that result from 
stochastic effects when the infected population is still small. Our model is of 
similar complexity to the standard mass-action SIR equations. 

The assumption that the network is a Configuration Model network is central
to this derivation. If there is a tendency for high degree individuals to
preferentially contact high degree individuals, these approaches do not directly
apply. Similarly the presence of many short cycles will also affect these
calculations. When a short cycle exists, whether or not one neighbor of the target node
is still susceptible may no longer be independent of whether another neighbor 
is still susceptible. 